On Suboptimal Alignments of Biological Sequences

Authors

  • Dalit Naor
  • Douglas L. Brutlag
Abstract

It is widely accepted that the optimal alignment between a pair of proteins or nucleic acid sequences that minimizes the edit distance may not necessarily reflect the correct biological alignment. Alignments of proteins based on their structures or of DNA sequences based on evolutionary changes are often different from alignments that minimize edit distance. However, in many cases (e.g. when the sequences are close), the edit distance alignment is a good approximation to the biological one. Since, for most sequences, the true alignment is unknown, a method that either assesses the significance of the optimal alignment, or that provides few "close" alternatives to the optimal one, is of great importance. A suboptimal alignment is an alignment whose score lies within the neighborhood of the optimal score. Enumeration of suboptimal alignments [Wa83, WaBy] is not very practical since there are many such alignments. Other approaches [Zuk, Vi, ViAr] that use only partial information about suboptimal alignments are more successful in practice. We present a method for representing all alignments whose score is within any given delta from the optimal score. It represents a large number of alignments by a compact graph which makes it easy to impose additional biological constraints and select one desirable alignment from this large set. We study the combinatorial nature of suboptimal alignments. We define a set of "canonical" suboptimal alignments, and argue that these are the essential ones since any other suboptimal alignment is a combination of few canonical ones. We then show how to efficiently enumerate suboptimal alignments in order of their score, and count their numbers. Examples are presented to motivate the problem.
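As an illustration of what "within a given delta of the optimal score" can mean on the edit graph, the following is a minimal sketch and not the representation the abstract describes. It assumes unit-cost edit distance (insertion, deletion, and substitution each cost 1), and the function names `edit_graph_tables` and `suboptimal_edges` are hypothetical. An edge is kept when the best alignment passing through it costs at most the optimum plus delta. Note that the retained edges may also combine into paths that exceed the bound, so this only marks candidate edges.

```python
def edit_graph_tables(a, b):
    """Forward and backward unit-cost edit-distance tables for strings a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    fwd = [[inf] * (m + 1) for _ in range(n + 1)]
    bwd = [[inf] * (m + 1) for _ in range(n + 1)]
    fwd[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # delete a[i-1]
                fwd[i][j] = min(fwd[i][j], fwd[i - 1][j] + 1)
            if j > 0:  # insert b[j-1]
                fwd[i][j] = min(fwd[i][j], fwd[i][j - 1] + 1)
            if i > 0 and j > 0:  # match or substitute
                fwd[i][j] = min(fwd[i][j], fwd[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    bwd[n][m] = 0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i < n:
                bwd[i][j] = min(bwd[i][j], bwd[i + 1][j] + 1)
            if j < m:
                bwd[i][j] = min(bwd[i][j], bwd[i][j + 1] + 1)
            if i < n and j < m:
                bwd[i][j] = min(bwd[i][j], bwd[i + 1][j + 1] + (a[i] != b[j]))
    return fwd, bwd


def suboptimal_edges(a, b, delta):
    """Return (optimum, edges) where each edge lies on some alignment of cost <= optimum + delta."""
    fwd, bwd = edit_graph_tables(a, b)
    n, m = len(a), len(b)
    opt = fwd[n][m]
    edges = []
    for i in range(n + 1):
        for j in range(m + 1):
            steps = []
            if i < n:
                steps.append((i + 1, j, 1))
            if j < m:
                steps.append((i, j + 1, 1))
            if i < n and j < m:
                steps.append((i + 1, j + 1, int(a[i] != b[j])))
            for x, y, w in steps:
                # Cheapest alignment forced through edge (i, j) -> (x, y).
                if fwd[i][j] + w + bwd[x][y] <= opt + delta:
                    edges.append(((i, j), (x, y), w))
    return opt, edges
```

For example, `suboptimal_edges("AC", "AG", 0)` keeps only the two diagonal edges of the single optimal alignment, while raising `delta` to 1 admits the edges of the cost-2 alternatives as well.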
Since alignments are essentially (s, t)-paths in a directed acyclic graph with (possibly negative) weights on its edges, our solution gives an extremely simple method to enumerate all K shortest (or longest) paths from s to t in such graphs in increasing order, as well as all (s, t)-paths that are within δ of the optimum, for any δ. We compare this solution with known algorithms that find the K-best shortest paths in a graph.

* Supported by a Postdoctoral Fellowship from the Program in Mathematics and Molecular Biology of the University of California at Berkeley, under National Science Foundation Grant DMS-9720208
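The (s, t)-path view in the abstract can be made concrete with a small sketch. The code below is one standard way to list (s, t)-paths of a weighted DAG in nondecreasing order of weight: a best-first search guided by exact distances to t. It is offered as an assumption-laden illustration, not as the algorithm of the paper. The function name `paths_in_order` and its `delta` and `k` parameters are hypothetical. Negative edge weights are fine because the distances are computed in topological order rather than with Dijkstra's algorithm.

```python
import heapq
from itertools import count


def paths_in_order(edges, s, t, delta=None, k=None):
    """Yield (weight, path) for (s, t)-paths of a DAG in nondecreasing weight order.

    edges: iterable of (u, v, w). Stops after k paths, or once paths exceed
    optimum + delta, when those limits are given.
    """
    adj, indeg = {s: [], t: []}, {s: 0, t: 0}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, [])
        indeg.setdefault(u, 0)
        indeg[v] = indeg.get(v, 0) + 1
    # Kahn's algorithm: the list grows while we iterate over it.
    order = [u for u in adj if indeg[u] == 0]
    for u in order:
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                order.append(v)
    # Exact shortest distance from every node to t, in reverse topological order.
    inf = float("inf")
    dist = {u: inf for u in adj}
    dist[t] = 0
    for u in reversed(order):
        if u == t:
            continue
        for v, w in adj[u]:
            if dist[v] + w < dist[u]:
                dist[u] = dist[v] + w
    if dist[s] == inf:
        return  # t is unreachable from s
    best = dist[s]
    tie = count()  # tie-breaker so the heap never compares paths
    heap = [(best, next(tie), 0, (s,))]
    emitted = 0
    while heap:
        bound, _, cost, path = heapq.heappop(heap)
        if delta is not None and bound > best + delta:
            return  # every remaining path is worse than optimum + delta
        u = path[-1]
        if u == t:
            yield cost, list(path)
            emitted += 1
            if k is not None and emitted >= k:
                return
            continue
        for v, w in adj[u]:
            if dist[v] < inf:
                # Priority is the weight of the best completion of this prefix.
                heapq.heappush(heap, (cost + w + dist[v], next(tie), cost + w, path + (v,)))
```

Because each prefix is keyed by the weight of its best completion, complete paths leave the heap in sorted order, so the `k` and `delta` cut-offs can stop the search early. Longest paths can be handled by negating the weights.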

Similar resources

Enumerating suboptimal alignments of multiple biological sequences efficiently.

The multiple sequence alignment problem is very applicable and important in various fields in molecular biology. Because the optimal alignment that maximizes the score is not always biologically most significant, providing many suboptimal alignments as alternatives for the optimal one is very useful. As for the alignment of two sequences, this suboptimal problem is well-studied, but for the ali...

Enumerating Suboptimal Alignments of Multiple Biological Sequences Efficiently

The multiple sequence alignment problem is very applicable and important in various fields in molecular biology. Because the optimal alignment that maximizes the score is not always biologically most significant, providing many suboptimal alignments as alternatives for the optimal one is very useful. As for the alignment of two sequences, this suboptimal problem is well-studied [6, 9, 12], but for th...

A New Approach to Suboptimal Pairwise Sequence Alignment

In comparative protein modeling, the quality of a template model depends heavily on the quality of the initial alignment between a given protein with unknown structure and various template proteins whose tertiary structure is available in the Protein Data Bank (PDB). Although pairwise sequence alignment has been solved for more than three decades, there remains a large discrepancy between the a...

Molecular analysis of AbOmpA type-1 as immunogenic target for therapeutic interventions against MDR Acinetobacter baumannii infection

Introduction: Acinetobacter baumannii is associated with hospital-acquired infections. Outer membrane protein A of A. baumannii (AbOmpA) is a well-characterized virulence factor which has important roles in the pathogenesis of this bacterium. Methods: Based on our PCR-sequencing of the ompA gene in the clinical isolates, AbOmpA protein can be categorized into two types, named here type-1 and type-2. We ...

Computing all Suboptimal Alignments in Linear Space

Recently, a new compact representation for suboptimal alignments was proposed by Naor and Brutlag (1993). The kernel of that representation is a minimal directed acyclic graph (DAG) containing all suboptimal alignments. In this paper, we propose a method that computes such a DAG in space linear to the graph size. Let F be the area of the region of the dynamic-programming matrix bounded by the s...

Publication date: 1993